By: Ari Sen
Across the country millions of students are returning to campuses or setting foot on them for the very first time. This new period in students’ lives brings lots of emotions — excitement and anxiety, fear and joy, connection and loneliness.
For the past decade, many have turned to social media accounts to express these feelings to their circle of friends, family and peers. But this trusted group may not be the only one watching: across the country, dozens of colleges have purchased a technology called Social Sentinel, in what they say is an attempt to keep the worst from happening. That figure is very likely an undercount; an email we obtained from one school says the service is used by "hundreds" of colleges and universities in 36 states. Still, this number is small compared to the "thousands" of K12 schools where the same technology is used.
Perhaps because of their comparatively lower numbers, very little attention has been paid to college campuses that use this technology. But, in my view, this is a mistake I hope to address with this story.
Unlike at the K12 level, where alerts are often sent to mental health counselors or school administrators, at the college level Social Sentinel’s alerts are typically sent to campus police officers. This is interesting for two reasons.
The first is that, unlike regular police, who answer to a chief who answers to a mayor or city council, campus police are essentially unaccountable to the population they serve. A college student usually gets no vote on who the chancellor or president of their university will be, nor can they usually argue against a police action before it takes place.
The second is that campus police were, quite literally, created to suppress student activism. Although the first campus police were established in 1894 at Yale, most other universities didn't follow suit until the late 1960s and early 1970s. According to historians, these departments were largely formed to quash student protests against the ongoing war in Vietnam.
My hypotheses are as follows:
In this project I will mainly be focusing on the second hypothesis.
My data is a collection of 1,236 tweets, nearly 400 of which were flagged by Social Sentinel as potential threats. The flagged tweets were gathered by Peter Aldhous and Lam Vo for their 2019 story "Your Dumb Tweets Are Getting Flagged To People Trying To Stop School Shootings." The unflagged tweets were scraped from Twitter using the Twint library in Python. These tweets were gathered from the same users in the same time period as the flagged tweets, plus or minus a week for users with only one flagged tweet.
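The collection window described above can be sketched as a small helper. This is only an illustration: the function name `scrape_window`, the `pad_days` parameter and the example dates are hypothetical, and the actual scraping code used with Twint may have worked differently.

```python
from datetime import date, timedelta


def scrape_window(flagged_dates, pad_days=7):
    """Return the (start, end) dates to scrape for one user.

    Assumption: the window runs from the user's earliest flagged tweet to
    their latest one, and is widened by `pad_days` on each side when the
    user has only a single flagged tweet.
    """
    if not flagged_dates:
        raise ValueError("need at least one flagged tweet date")
    start, end = min(flagged_dates), max(flagged_dates)
    if len(flagged_dates) == 1:
        pad = timedelta(days=pad_days)
        return start - pad, end + pad
    return start, end


# Hypothetical dates, for illustration only.
single = scrape_window([date(2019, 3, 10)])
several = scrape_window([date(2019, 3, 1), date(2019, 3, 20)])
```

The resulting start and end dates would then be handed to the scraper as the search range for that user.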
To support my reporting hypotheses, I generated embeddings for every tweet using BERTweet and then clustered them using k-means, with a k of 2. I then compared the resulting clusters to the labels Social Sentinel assigned to the tweets and to my human-annotated labels of whether the tweet was threatening or not.
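A minimal sketch of this embed-then-cluster step follows. Several details are assumptions rather than a record of what I ran: the `vinai/bertweet-base` checkpoint, mean pooling over token vectors, and the helper names `embed_tweets`, `cluster_tweets` and `align_clusters`. Because k-means cluster ids are arbitrary, the sketch picks whichever of the two possible id-to-label mappings agrees best with the reference labels before comparing.

```python
import numpy as np
from sklearn.cluster import KMeans


def embed_tweets(texts, model_name="vinai/bertweet-base", batch_size=32):
    """Return one vector per tweet (assumed: mean-pooled BERTweet states)."""
    # Imported lazily so the clustering helpers work without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    chunks = []
    with torch.no_grad():
        for start in range(0, len(texts), batch_size):
            batch = list(texts[start:start + batch_size])
            enc = tokenizer(batch, padding=True, truncation=True,
                            max_length=128, return_tensors="pt")
            hidden = model(**enc).last_hidden_state
            # Average token vectors, ignoring padding positions.
            mask = enc["attention_mask"].unsqueeze(-1).float()
            pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
            chunks.append(pooled.numpy())
    return np.concatenate(chunks, axis=0)


def cluster_tweets(embeddings, seed=0):
    """Assign each embedding to one of two k-means clusters (k = 2)."""
    km = KMeans(n_clusters=2, n_init=10, random_state=seed)
    return km.fit_predict(embeddings)


def align_clusters(cluster_ids, labels):
    """Flip the 0/1 cluster ids if that agrees better with `labels`."""
    cluster_ids = np.asarray(cluster_ids)
    labels = np.asarray(labels)
    flipped = 1 - cluster_ids
    if (flipped == labels).mean() > (cluster_ids == labels).mean():
        return flipped
    return cluster_ids
```

With the tweets in a list called `texts` and 0/1 reference labels in `labels`, usage would look like `preds = align_clusters(cluster_tweets(embed_tweets(texts)), labels)`.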
To generate visualizations, I used t-SNE to reduce the dimensionality of the BERTweet embeddings to two dimensions and plotted each tweet on an x, y coordinate plane.
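The projection step can be sketched with scikit-learn's `TSNE`. The helper names `project_2d` and `plot_projection`, the perplexity rule, the random seed and the output filename are all assumptions for illustration, not the settings behind the plots in this story.

```python
import numpy as np
from sklearn.manifold import TSNE


def project_2d(embeddings, seed=0):
    """Reduce high-dimensional embeddings to 2-D points with t-SNE."""
    embeddings = np.asarray(embeddings)
    n_samples = embeddings.shape[0]
    # t-SNE requires perplexity to be smaller than the number of samples.
    perplexity = min(30.0, n_samples - 1)
    tsne = TSNE(n_components=2, perplexity=perplexity,
                init="pca", random_state=seed)
    return tsne.fit_transform(embeddings)


def plot_projection(coords, labels, path="tsne.png"):
    """Scatter the 2-D points, colored by a 0/1 label array."""
    import matplotlib.pyplot as plt  # imported lazily; only needed to plot

    fig, ax = plt.subplots(figsize=(6, 6))
    ax.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="coolwarm", s=10)
    ax.set_xlabel("t-SNE dimension 1")
    ax.set_ylabel("t-SNE dimension 2")
    fig.savefig(path, dpi=150)
    plt.close(fig)
```

Calling `plot_projection` once per label set on the same coordinates gives plots that differ only in coloring, which makes the label sets easy to compare by eye.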
I colored the plots:
The plots are reproduced below:
I also used topic modeling to investigate the salience of groups of words in the corpus:
The results from the clustering suggest that neither my system nor Social Sentinel's method is very accurate. The company's model did score higher on accuracy, though: 0.696 vs. my 0.574, measured against the human-annotated labels.
However, my method produces far fewer false positives, that is, it has much higher precision: 0.310 for my system vs. 0.065 for Social Sentinel.
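For readers unfamiliar with the two metrics, here is a small sketch of how accuracy and precision are computed from predicted and human-annotated labels. The function names and the toy labels are made up for illustration; they are not the tweets or scores from this analysis.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Share of all tweets that were labeled correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true, y_pred):
    """Share of tweets flagged as threats that really were threats."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy labels only: 1 = threatening, 0 = not threatening.
human = [1, 0, 0, 1, 0, 0, 0, 1]
system = [1, 1, 0, 0, 0, 1, 0, 1]
toy_accuracy = accuracy(human, system)
toy_precision = precision(human, system)
```

A system can score well on accuracy simply by getting many non-threatening tweets right, while still having low precision if most of what it flags turns out to be harmless.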
This should be a concerning finding for the company and any school that uses this technology, given that I built my system in only a few hours and had far less training data to work with for this task.
Social Sentinel has repeatedly claimed that they have significantly reduced their false positives:
My analysis suggests that these claims are at best dubious and at worst outright fabrications. The system, as evaluated here, is thus likely to be a waste of both money and precious policing and mental health resources.
This reporting also raises the question: If it isn't threats of suicide and shootings they are surfacing, what is it that they are catching?
My reporting so far suggests that the answer may be protests and student activism.